AMENDMENT TO THE CLAIMS

Please amend the claims as follows: 

1. (Currently Amended) A system for processing an audio signal comprising:

means for dividing the audio signal into segments, each segment representing a 
portion of the audio signal occurring in one of a succession of time intervals;

means for detecting for each segment the presence of a fundamental frequency; 
means responsive to the detecting means for determining the voicing probability 
for each segment by computing a ratio between voiced and unvoiced components of the 
audio signal, the determining means comprising: 

means for windowing each segment of the audio signal; 
means for computing the spectrum of the windowed segment; 
means for computing correlation coefficients of each segment using at least 
the spectrum; 

means for estimating a voicing threshold for each segment, comprising:
means for dividing the spectrum into a plurality of non-linear bands, wherein
the low bands of the spectrum have a higher resolution than the high bands of the
spectrum;

means for evaluating at least one voice measurement for each of the
plurality of bands; and

means for determining the voicing threshold for each segment using the at
least one voice measurement; and

means for comparing the correlation coefficients with [[a]] the voicing
threshold for each segment;

means for separating the signal in each segment into a voiced portion and an 
unvoiced portion on the basis of the voicing probability, wherein the voiced portion of 
the signal occupies the low end of the spectrum and the unvoiced portion of the signal 
occupies the high end of the spectrum for each segment; and 

means for separately encoding the voiced portion and the unvoiced portion of the
audio signal[[, wherein the means for separately encoding further includes means for
computing LPC coefficients for a speech segment and means for transforming LPC
coefficients into line spectral frequencies (LSF) coefficients corresponding to the LPC
coefficients]].

2. (Original) The system of Claim 1, wherein the audio signal is a speech signal
and the means for determining the voicing probability further comprises means for 
refining the fundamental frequency of each segment using at least the spectrum of the 
windowed segment. 

3. (Cancelled) 

4. (Original) The system of Claim 1, wherein the means for computing the 
spectrum of the windowed segment comprises means for performing a Fast Fourier 
Transform (FFT) of the windowed segment.
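
For illustration only, the windowing and FFT steps recited in Claims 1 and 4 might be sketched as follows. The Hamming window, the radix-2 recursion, the power-of-two segment length, and the names `hamming`, `fft`, and `windowed_spectrum` are assumptions made for this sketch; the claims do not specify a window shape or a transform length.

```python
import cmath
import math


def hamming(n):
    """Return an n-point Hamming window (an assumed window shape)."""
    if n == 1:
        return [1.0]
    return [0.54 - 0.46 * math.cos(2 * math.pi * i / (n - 1)) for i in range(n)]


def fft(x):
    """Recursive radix-2 FFT; len(x) must be a power of two."""
    n = len(x)
    if n == 0 or n & (n - 1):
        raise ValueError("length must be a power of two")
    if n == 1:
        return [complex(x[0])]
    even = fft(x[0::2])
    odd = fft(x[1::2])
    out = [0j] * n
    for k in range(n // 2):
        # Twiddle factor applied to the odd-indexed half.
        t = cmath.exp(-2j * math.pi * k / n) * odd[k]
        out[k] = even[k] + t
        out[k + n // 2] = even[k] - t
    return out


def windowed_spectrum(segment):
    """Window one segment and return its complex spectrum."""
    w = hamming(len(segment))
    return fft([s * wi for s, wi in zip(segment, w)])
```

For a segment holding a single cosine, the magnitude of the returned spectrum peaks at the bin matching the cosine's frequency.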

5. (Currently Amended) The system of Claim 1, [[further comprising]] wherein the
means for estimating the voicing threshold for each segment [[comprising]] further
comprises:

[[means for dividing the spectrum into a plurality of non-linear bands, where the
low bands of the spectrum have a higher resolution than the high bands of the
spectrum;]]

[[means for evaluating at least one voice measurement for each of the plurality of
bands, where the at least one voice measurement is the normalized correlation
coefficients calculated in the frequency domain;]]

means for computing [[the]] a low band energy of the spectrum;

means for computing an energy ratio between the energy of the high and low
bands of the spectrum of a current segment and a previous segment; and

a multi-layer neural network classifier for receiving [[the normalized correlation
coefficients of the low bands,]] the at least one voice measurement, the low band energy,
and the energy ratio, wherein the at least one voice measurement includes normalized
correlation coefficients in the frequency domain.



Aug-24-2005 01:40pm From-Moser, Patterson & Sheridan, LLP - NJ +17325309808 T-379 P.005/020 F-360

Serial No. 09/625,980 (Atty. Docket No. Aguilar 1-24-1-1 (LCNT/122485))
Reply of Office Action of June 24, 2005

6. (Original) The system of Claim 1, further comprising means for spectrally 
estimating the audio signal comprising: 

means for calculating a complex spectrum for each segment by using a window 
based on the fundamental frequency; 

means for spectrally modeling each segment using at least the complex 
spectrum, the fundamental frequency, and the voicing probability to obtain line spectral 
frequencies (LSF) coefficients and a signal gain of each segment. 

7. (Original) The system of Claim 6, wherein the means for calculating the complex 
spectrum comprises means for applying a Fast Fourier Transform to the windowed 
segment. 

8. (Currently Amended) A system for processing an audio signal comprising:
means for dividing the signal into segments, each segment representing a 

portion of the audio signal in one of a succession of time intervals; 

means for detecting for each segment the presence of a fundamental frequency; 
means responsive to the detecting means for determining the voicing probability
for each segment by computing a ratio between voiced and unvoiced components of the
audio signal, the determining means comprising:

means for windowing each segment of the audio signal;
means for computing the spectrum of the windowed segment;
means for computing correlation coefficients of each segment using at least
the spectrum;

means for estimating a voicing threshold for each segment, comprising:
means for dividing the spectrum into a plurality of non-linear bands, wherein
the low bands of the spectrum have a higher resolution than the high bands of the
spectrum;

means for evaluating at least one voice measurement for each of the
plurality of bands; and

means for determining the voicing threshold for each segment using the at
least one voice measurement; and

means for comparing the correlation coefficients with the voicing threshold for
each segment;

means for calculating a complex spectrum for each segment by using a window 
based on the fundamental frequency; 

means for spectrally modeling each segment using at least the complex 
spectrum, the fundamental frequency, and the voicing probability to obtain line spectral 
frequencies (LSF) coefficients and a signal gain of each segment; 

means for separating the signal in each segment into a voiced portion and an 
unvoiced portion on the basis of the voicing probability, wherein the voiced portion of 
the signal occupies the low end of the spectrum and the unvoiced portion of the signal 
occupies the high end of the spectrum for each segment; and 

means for separately encoding the voiced portion and the unvoiced portion of the 
audio signal, wherein the means for separately encoding further includes means for 
computing LPC coefficients for a speech segment and means for transforming LPC 
coefficients into line spectral frequencies (LSF) coefficients corresponding to the LPC
coefficients.

9. (Original) The system of Claim 8, wherein the audio signal is a speech signal 
and the means for determining the voicing probability comprises means for refining the 
fundamental frequency of each segment using at least the spectrum of the windowed 
segment.

10. (Cancelled) 

11. (Original) The system of Claim 8, wherein the means for computing the
spectrum of the windowed segment comprises means for performing a Fast Fourier 
Transform (FFT) of the windowed segment. 

12. (Cancelled) 

13. (Currently Amended) The system of Claim [[45]] 8, [[further comprising]] wherein the
means for estimating the voicing threshold for each segment [[comprising]] further
comprises:

[[means for dividing the spectrum into a plurality of non-linear bands, where the
low bands of the spectrum have a higher resolution than the high bands of the
spectrum;]]

[[means for evaluating at least one voice measurement for each of the plurality of
bands, where the at least one voice measurement is the normalized correlation
coefficients calculated in the frequency domain;]]

means for computing [[the]] a low band energy of the spectrum;

means for computing an energy ratio between the energy of the high and low
bands of the spectrum of a current segment and a previous segment; and

a multi-layer neural network classifier for receiving [[the normalized correlation
coefficients of the low bands,]] the at least one voice measurement, the low band energy,
and the energy ratio, wherein the at least one voice measurement includes normalized
correlation coefficients in the frequency domain.

14. (Original) The system of Claim 8, wherein the means for calculating the complex
spectrum comprises means for applying a Fast Fourier Transform to the windowed
segment.

15. (Previously presented) A system for processing an audio signal having a 
number of frames, the system comprising: 

an encoder comprising: 

first means for determining for each frame a ratio between voiced and 
unvoiced components of the audio signal on the basis of the fundamental 
frequency of each frame, the ratio being defined as a voicing probability, the 
means for determining the voicing probability comprising: 

means for windowing each frame of the input signal; 
means for computing the spectrum of the windowed frame; 
means for computing correlation coefficients of each frame using at 
least the spectrum; and 
means for comparing the correlation coefficients with a voicing
threshold for each segment; 
second means for determining at least a pitch period, a mid-frame pitch
period, and a mid-frame voicing probability of the audio signal; and 

means for quantizing at least the pitch period, the voicing probability, the 
mid-frame pitch period, and the mid-frame voicing probability. 

16. (Original) The system of Claim 15, further comprising a decoder comprising:
means for unquantizing at least the pitch period, the voicing probability, the mid- 
frame pitch period, and/or the mid-frame voicing probability and providing at least one 
output; and 

means for analyzing the at least one output to produce a synthetic speech signal 
corresponding to the input audio signal. 

17. (Original) The system of Claim 15, further comprising means for estimating the
voicing threshold for each segment comprising:

means for dividing the spectrum into a plurality of non-linear bands, where the
low bands of the spectrum have a higher resolution than the high bands of the 
spectrum; 

means for evaluating at least one voice measurement for each of the plurality of 
bands, where the at least one voice measurement is the normalized correlation 
coefficients calculated in the frequency domain; 

means for computing the low band energy of the spectrum; 
means for computing an energy ratio between the energy of the high and low bands of 
the spectrum of a current segment and a previous segment; and 

means for receiving the normalized correlation coefficients of the low bands, the 
low band energy and the energy ratio. 

18. (Original) The system of Claim 17, wherein the means for receiving is a multi- 
layer neural network classifier.

19. (Original) The system of Claim 18, wherein the voicing probability is zero if an
output from the means for receiving is less than a predetermined threshold for a 
predetermined number of frames. 
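
For illustration only, the gating recited in Claim 19 might be sketched as follows. The sketch assumes the "predetermined number of frames" means consecutive frames, and the names `gate_voicing`, `threshold`, and `hold_frames` are hypothetical; the claim itself fixes neither the threshold nor the frame count.

```python
def gate_voicing(classifier_outputs, voicing_probs, threshold, hold_frames):
    """Zero the voicing probability once the classifier output has stayed
    below `threshold` for `hold_frames` consecutive frames (assumed reading)."""
    gated = []
    run = 0  # number of consecutive frames with output below the threshold
    for out, pv in zip(classifier_outputs, voicing_probs):
        run = run + 1 if out < threshold else 0
        gated.append(0.0 if run >= hold_frames else pv)
    return gated
```

With `hold_frames` set to 2, a single low classifier output leaves the voicing probability untouched, while the second low frame in a row forces it to zero.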

20. (Original) The system of Claim 15, wherein further comprising means for high- 
pass filtering the audio signal and buffering the audio signal into the number of frames. 

21. (Original) The system of Claim 15, wherein the encoder further comprises 
spectral estimation means for computing an estimate of the power spectrum of the 
audio signal using a pitch adaptive window. 

22. (Original) The system of Claim 21, wherein the length of the pitch adaptive
window is based on the fundamental frequency of the audio signal. 

23. (Original) The system of Claim 16, wherein the means for unquantizing
comprises: 

means for producing a spectral magnitude envelope and a minimum phase 
envelope using at least the unquantized pitch period, the unquantized voicing 
probability, the unquantized mid-frame pitch period, and/or the unquantized mid-frame 
voicing probability; 

means for interpolating and outputting the spectral magnitude envelope and the
minimum phase envelope to the means for analyzing;

means for estimating the signal-to-noise ratio of the audio signal using the at 
least the unquantized pitch period, the unquantized voicing probability, the unquantized 
mid-frame pitch period, and/or the unquantized mid-frame voicing probability; and 

means for generating at least one control parameter using at least the signal-to- 
noise ratio and for outputting the at least one control parameter to the means for 
analyzing. 

24. (Original) The system of Claim 16, wherein the means for analyzing comprises: 
first means for processing the at least one output to produce a time-domain 
signal; and

second means for processing the time-domain signal to produce the synthetic 
speech signal corresponding to the audio signal. 

25. (Original) The system of Claim 24, wherein the first means for processing the at 
least one output to produce the time-domain signal comprises: 

means for filtering a spectral magnitude envelope, wherein the spectral 
magnitude envelope is outputted by the means for unquantizing; 

means for calculating frequencies and amplitudes using at least the filtered
spectral magnitude envelope; 

means for calculating sine-wave phases using at least the calculated 
frequencies; and 

means for calculating a sum of sinusoids using at least the calculated 
frequencies and amplitudes and the sine-wave phases to produce the time-domain
signal.
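
For illustration only, the sum-of-sinusoids step recited in Claim 25 might be sketched as follows. The function name `sum_of_sinusoids`, the cosine convention, and the fixed per-frame frequencies, amplitudes, and phases are assumptions made for this sketch; the claim does not state how these parameters vary within a frame.

```python
import math


def sum_of_sinusoids(freqs_hz, amps, phases, num_samples, sample_rate):
    """Return num_samples of a time-domain signal built by summing cosines
    with the given frequencies (Hz), amplitudes, and phases (radians)."""
    out = []
    for n in range(num_samples):
        t = n / sample_rate  # time of sample n in seconds
        out.append(sum(a * math.cos(2 * math.pi * f * t + p)
                       for f, a, p in zip(freqs_hz, amps, phases)))
    return out
```

A single component at 0 Hz with amplitude 2 and zero phase yields a constant signal of 2, which is a quick sanity check on the amplitude convention.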

26. (Original) The system of Claim 15, further comprising:

means for calculating a complex spectrum for each segment by using a window 
based on the fundamental frequency; and 

means for spectrally modeling each segment using at least the complex 
spectrum, the fundamental frequency, and the voicing probability to obtain line spectral 
frequencies (LSF) coefficients and a signal gain of each segment. 

27. (Original) The system of Claim 26, wherein the means for calculating the 
complex spectrum comprises means for applying a Fast Fourier Transform to the 
windowed segment. 

28. (Previously presented) A system for processing an audio signal having a 
number of frames, the system comprising: 

an encoder comprising: 

means for determining for each frame a ratio between voiced and 
unvoiced components of the audio signal on the basis of the fundamental frequency of
each frame, the ratio being defined as a voicing probability; 

means for calculating a complex spectrum for each segment by using a 
window based on the fundamental frequency; 

means for spectrally modeling each segment using at least the complex 
spectrum, the fundamental frequency, and the voicing probability to obtain line spectral 
frequencies (LSF) coefficients and a signal gain of each segment; 

means for determining at least a pitch period, a mid-frame pitch period, 
and a mid-frame voicing probability of the audio signal; and 

means for quantizing at least the pitch period, the voicing probability, the 
mid-frame pitch period, and the mid-frame voicing probability. 

29. (Original) The system of Claim 28, further comprising a decoder comprising: 
means for unquantizing at least the pitch period, the voicing probability, the mid- 
frame pitch period, and/or the mid-frame voicing probability and providing at least one 
output; and 

means for analyzing the at least one output to produce a synthetic speech signal 
corresponding to the input audio signal.

30. (Original) The system of Claim 28, further comprising means for estimating the
voicing threshold for each segment comprising: 

means for dividing the spectrum into a plurality of non-linear bands, where the 
low bands of the spectrum have a higher resolution than the high bands of the 
spectrum; 

means for evaluating at least one voice measurement for each of the plurality of 
bands, where the at least one voice measurement is the normalized correlation 
coefficients calculated in the frequency domain;

means for computing the low band energy of the spectrum; 

means for computing an energy ratio between the energy of the high and low 
bands of the spectrum of a current segment and a previous segment; and 

means for receiving the normalized correlation coefficients of the low bands, the 
low band energy and the energy ratio.

31. (Original) The system of Claim 30, wherein the means for receiving is a multi- 
layer neural network classifier. 

32. (Original) The system of Claim 31, wherein the voicing probability is zero if an 
output from the means for receiving is less than a predetermined threshold for a 
predetermined number of frames.

33. (Original) The system of Claim 28, further comprising means for high-pass 
filtering the audio signal and buffering the audio signal into the number of frames. 

34. (Original) The system of Claim 28, wherein the encoder further comprises 
spectral estimation means for computing an estimate of the power spectrum of the 
audio signal using a pitch adaptive window. 

35. (Original) The system of Claim 34, wherein the length of the pitch adaptive 
window is based on the fundamental frequency of the audio signal.

36. (Original) The system of Claim 29, wherein the means for unquantizing 
comprises: 

means for producing a spectral magnitude envelope and a minimum phase 
envelope using at least the unquantized pitch period, the unquantized voicing 
probability, the unquantized mid-frame pitch period, and/or the unquantized mid-frame 
voicing probability; 

means for interpolating and outputting the spectral magnitude envelope and the 
minimum phase envelope to the means for analyzing; 

means for estimating the signal-to-noise ratio of the audio signal using the at 
least the unquantized pitch period, the unquantized voicing probability, the unquantized 
mid-frame pitch period, and/or the unquantized mid-frame voicing probability; and 

means for generating at least one control parameter using at least the signal-to- 
noise ratio and for outputting the at least one control parameter to the means for
analyzing. 

37. (Original) The system of Claim 29, wherein the means for analyzing comprises: 
first means for processing the at least one output to produce a time-domain 

signal; and 

second means for processing the time-domain signal to produce the synthetic 
speech signal corresponding to the audio signal. 

38. (Original) The system of Claim 37, wherein the first means for processing the at
least one output to produce the time-domain signal comprises: 

means for filtering a spectral magnitude envelope, wherein the spectral 
magnitude envelope is outputted by the means for unquantizing; 

means for calculating frequencies and amplitudes using at least the filtered 
spectral magnitude envelope; 

means for calculating sine-wave phases using at least the calculated 
frequencies; and

means for calculating a sum of sinusoids using at least the calculated 
frequencies and amplitudes and the sine-wave phases to produce the time-domain 
signal. 

39. (Original) The system of Claim 28, wherein the means for determining the
voicing probability comprises: 

means for windowing each frame of the input signal; 
means for computing the spectrum of the windowed frame; 
means for computing correlation coefficients of each frame using at least the 
spectrum; and 

means for comparing the correlation coefficients with a voicing threshold for each 
segment.

40. (Original) The system of Claim 28, wherein the means for calculating the
complex spectrum comprises means for applying a Fast Fourier Transform to the
windowed segment. 

41. (Withdrawn) A system for processing an audio signal having a number of 
frames, the system comprising: 

a decoder comprising: 

means for unquantizing at least a pitch period, a voicing probability, a mid-frame 
pitch period, and/or a mid-frame voicing probability of the audio signal and providing at 
least one output, where the means for unquantizing comprises means for generating at 
least one control parameter using at least the signal-to-noise ratio computed using a 
gain and the voicing probability of the audio signal; and 

means for analyzing the at least one output, including the at least one control 
parameter, to produce a synthetic speech signal corresponding to the input audio 
signal. 

42. (Withdrawn) The system of Claim 41, wherein the means for unquantizing 
comprises: 

means for producing a spectral magnitude envelope and a minimum phase 
envelope using at least the unquantized pitch period, the unquantized voicing 
probability, the unquantized mid-frame pitch period, and/or the unquantized mid-frame 
voicing probability; 

means for interpolating and outputting the spectral magnitude envelope and the 
minimum phase envelope to the means for analyzing; and 

means for estimating the signal-to-noise ratio of the audio signal using the at 
least the unquantized pitch period, the unquantized voicing probability, the unquantized 
mid-frame pitch period, and/or the unquantized mid-frame voicing probability and 
outputting the signal-to-noise ratio to the means for generating at least one control 
parameter. 

43. (Withdrawn) The system of Claim 41, wherein the means for analyzing 
comprises: 
first means for processing the at least one output to produce a time-domain
signal; and 

second means for processing the time-domain signal to produce the synthetic 
speech signal corresponding to the audio signal. 

44. (Withdrawn) The system of Claim 43, wherein the first means for processing the 
at least one output to produce the time-domain signal comprises:

means for filtering a spectral magnitude envelope, wherein the spectral 
magnitude envelope is outputted by the means for unquantizing; 

means for calculating frequencies and amplitudes using at least the filtered 
spectral magnitude envelope; 

means for calculating sine-wave phases using at least the calculated 
frequencies; and 

means for calculating a sum of sinusoids using at least the calculated 
frequencies and amplitudes and the sine-wave phases to produce the time-domain 
signal. 